Hierarchy Through Composition with Multitask LMDPs
Abstract
Hierarchical architectures are critical to the scalability of reinforcement learning methods. Most current hierarchical frameworks execute actions serially, with macro-actions comprising sequences of primitive actions. We propose a novel alternative to these control hierarchies based on concurrent execution of many actions in parallel. Our scheme exploits the guaranteed concurrent compositionality provided by the linearly solvable Markov decision process (LMDP) framework, which naturally enables a learning agent to draw on several macro-actions simultaneously to solve new tasks. We introduce the Multitask LMDP module, which maintains a parallel distributed representation of tasks and may be stacked to form deep hierarchies abstracted in space and time.
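The compositionality mentioned in the abstract can be illustrated with a minimal sketch. It assumes the standard first-exit LMDP formulation, in which the desirability z = exp(-v) satisfies a linear equation, so the solution over interior states is linear in the boundary desirabilities. The five-state chain, the cost values, the blend weights, and the helper name `solve_desirability` are all hypothetical choices made for illustration; this is not the authors' implementation.

```python
import numpy as np

def solve_desirability(P_ii, P_ib, q_i, z_b):
    """Solve the linear first-exit LMDP equation for interior desirability.

    Interior states satisfy z_i = exp(-q_i) * (P_ii @ z_i + P_ib @ z_b),
    which rearranges to (I - M P_ii) z_i = M P_ib z_b with M = diag(exp(-q_i)).
    """
    M = np.diag(np.exp(-q_i))
    A = np.eye(len(q_i)) - M @ P_ii
    return np.linalg.solve(A, M @ P_ib @ z_b)

# Hypothetical chain 0-1-2-3-4: states 0 and 4 are boundary (terminal),
# states 1, 2, 3 are interior. Passive dynamics: unbiased random walk.
P_ii = np.array([[0.0, 0.5, 0.0],
                 [0.5, 0.0, 0.5],
                 [0.0, 0.5, 0.0]])   # interior -> interior
P_ib = np.array([[0.5, 0.0],
                 [0.0, 0.0],
                 [0.0, 0.5]])        # interior -> boundary
q_i = np.full(3, 0.1)                # assumed per-step state cost

# Two base tasks, defined by terminal costs on the boundary states (0, 4).
z_b_left = np.exp(-np.array([0.0, 5.0]))    # cheap to exit on the left
z_b_right = np.exp(-np.array([5.0, 0.0]))   # cheap to exit on the right

z_left = solve_desirability(P_ii, P_ib, q_i, z_b_left)
z_right = solve_desirability(P_ii, P_ib, q_i, z_b_right)

# A new task whose boundary desirability is a weighted blend of the base tasks.
w = np.array([0.3, 0.7])
z_b_new = w[0] * z_b_left + w[1] * z_b_right
z_new = solve_desirability(P_ii, P_ib, q_i, z_b_new)

# Because the equation is linear in z_b, the same blend of the base solutions
# solves the new task without solving it again.
z_blend = w[0] * z_left + w[1] * z_right
print(np.allclose(z_new, z_blend))
```

The final check compares the directly solved new task against the blended base solutions; in this formulation the two agree up to numerical precision, which is the property that lets an agent draw on several previously solved tasks at once.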
Related articles
Supplementary Material for Hierarchy Through Composition with Multitask LMDPs
A key difference between the LMDP and standard MDPs is the fact that the cost function must include the KL term penalizing deviation from the ‘passive dynamics.’ While this may seem limiting, most domains of interest have some notion of ‘efficient’ actions, making a control cost a reasonably natural and universal phenomenon. Indeed, we suggest that for many real-world domains the standard MDP f...
Hierarchical Linearly-Solvable Markov Decision Problems
We present a hierarchical reinforcement learning framework that formulates each task in the hierarchy as a special type of Markov decision process for which the Bellman equation is linear and has an analytical solution. Problems of this type, called linearly-solvable MDPs (LMDPs), have interesting properties that can be exploited in a hierarchical setting, such as efficient learning of the optimal ...
Hierarchy through Composition with Linearly Solvable Markov Decision Processes
Hierarchical architectures are critical to the scalability of reinforcement learning methods. Current hierarchical frameworks execute actions serially, with macroactions comprising sequences of primitive actions. We propose a novel alternative to these control hierarchies based on concurrent execution of many actions in parallel. Our scheme uses the concurrent compositionality provided by the l...
Relational Feature Mining with Hierarchical Multitask kFOIL
We introduce hierarchical kFOIL as a simple extension of the multitask kFOIL learning algorithm. The algorithm first learns a core logic representation common to all tasks, and then refines it by specialization on a per-task basis. The approach can be easily generalized to a deeper hierarchy of tasks. A task clustering algorithm is also proposed in order to automatically generate the task hiera...
Generalized Dictionary for Multitask Learning with Boosting
While multitask learning has been extensively studied, most existing methods rely on linear models (e.g. linear regression, logistic regression), which may fail in dealing with more general (nonlinear) problems. In this paper, we present a new approach that combines dictionary learning with gradient boosting to achieve multitask learning with general (nonlinear) basis functions. Specifically, f...